Optimizing shared cache behavior of chip multiprocessors

Abstract

One of the critical problems associated with emerging chip multiprocessors (CMPs) is the management of on-chip shared cache space. Unfortunately, single-processor-centric data locality optimization schemes may not work well in the CMP case, as data accesses from multiple cores can create conflicts in the shared cache space. The main contribution of this paper is a compiler-directed code restructuring scheme for enhancing locality of shared data in CMPs. The proposed scheme targets the last-level shared cache that exists in many commercial CMPs and has two components: allocation, which determines the set of loop iterations assigned to each core, and scheduling, which determines the order in which the iterations assigned to a core are executed. Our scheme restructures the application code such that the different cores operate on shared data blocks at the same time, to the extent allowed by data dependencies. This helps to reduce reuse distances for the shared data and improves on-chip cache performance. We evaluated our approach using the Splash-2 and Parsec applications through both simulations and experiments on two commercial multi-core machines. Our experimental evaluation indicates that the proposed data locality optimization scheme reduces inter-core conflict misses in the shared cache by 67% on average when both allocation and scheduling are used. Also, the execution time improvements we achieve (29% on average) are very close to the optimal savings that could be achieved using a hypothetical scheme. Copyright 2009 ACM.
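
The allocation and scheduling idea in the abstract can be illustrated with a small toy model. The sketch below is not the paper's compiler algorithm: it assumes each loop iteration touches exactly one shared data block, that there are no data dependencies between iterations, and that cores advance in lockstep. All function names and the `accesses` list are hypothetical. It compares a plain contiguous-chunk allocation with a block-aligned plan and measures the mean reuse distance of the resulting shared-cache access trace.

```python
from collections import defaultdict

def chunked_plan(num_iters, num_cores):
    """Baseline: each core gets one contiguous chunk of iterations, run in order."""
    chunk = -(-num_iters // num_cores)  # ceiling division
    return [list(range(c * chunk, min((c + 1) * chunk, num_iters)))
            for c in range(num_cores)]

def block_aligned_plan(accesses, num_cores):
    """Toy allocation + scheduling (assumes no data dependencies).

    Allocation: the iterations that touch a given block are dealt out
    round-robin across the cores.
    Scheduling: every core visits the blocks in the same (sorted) order,
    so cores tend to work on the same block at the same time.
    """
    by_block = defaultdict(list)
    for iteration, block in enumerate(accesses):
        by_block[block].append(iteration)
    plan = [[] for _ in range(num_cores)]
    for block in sorted(by_block):
        for k, iteration in enumerate(by_block[block]):
            plan[k % num_cores].append(iteration)
    return plan

def lockstep_trace(plan, accesses):
    """Blocks seen by the shared cache if cores run one iteration per step."""
    trace = []
    for step in range(max(len(p) for p in plan)):
        for core_iters in plan:
            if step < len(core_iters):
                trace.append(accesses[core_iters[step]])
    return trace

def mean_reuse_distance(trace):
    """Average count of distinct other blocks between consecutive uses of a block."""
    last_seen = {}
    distances = []
    for pos, block in enumerate(trace):
        if block in last_seen:
            distances.append(len(set(trace[last_seen[block] + 1:pos])))
        last_seen[block] = pos
    return sum(distances) / len(distances) if distances else 0.0

# Hypothetical access map: accesses[i] is the block touched by iteration i.
accesses = [0, 1, 2, 3, 2, 3, 0, 1]
baseline = mean_reuse_distance(lockstep_trace(chunked_plan(len(accesses), 2), accesses))
aligned = mean_reuse_distance(lockstep_trace(block_aligned_plan(accesses, 2), accesses))
print(baseline, aligned)  # 2.5 0.0
```

In this toy input the chunked plan makes the two cores touch the same blocks at different times, so each block is evicted-prone between its two uses; the block-aligned plan brings both uses of every block next to each other in the trace. A real implementation would also have to respect data dependencies, which this sketch ignores.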

Bibliographic details

Authors
Kandemir, M.; Muralidhara, S.P.; Narayanan, S.H.K.; Zhang, Y.; Ozturk, O.
Year: 2009
Original format: PDF
Language: English

Similar literature

1. Adaptive Set Pinning: Managing Shared Caches in Chip Multiprocessors [J]. Shekhar Srikantaiah, Mahmut Kandemir, Mary Jane Irwin. Computer architecture news. 2008, no. 1
2. Adaptive set pinning: managing shared caches in chip multiprocessors [J]. Shekhar Srikantaiah, Mahmut Kandemir, Mary Jane Irwin. ACM SIGPLAN Notices: A Monthly Publication of the Special Interest Group on Programming Languages. 2008, no. 3
3. An LRU-based Replacement Algorithm Augmented with Frequency of Access in Shared Chip-Multiprocessor Caches [J]. Haakon Dybdahl, Per Stenstroem, Lasse Natvig. Computer architecture news. 2007, no. 4
4. Optimizing shared cache behavior of chip multiprocessors [C]. Mahmut Kandemir, Sai Prashanth Muralidhara, Sri Hari Krishna Narayanan, et al. Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture. 2009
5. Dynamic cache reconfiguration for energy optimization in chip multiprocessors [D]. Puri, Gaurav. 2012
6. Single‐nucleotide polymorphisms in cachexia‐related genes: Can they optimize the treatment of cancer cachexia? [O]. Junichi Ishida, Masakazu Saitoh, Jochen Springer. 2017
7. ACM: An Efficient Approach for Managing Shared Caches in Chip Multiprocessors [O]. Hammoud Mohammad, Cho Sangyeun, Melhem Rami. 2009
